Transform method in an image coding system and apparatus for the same
Patent abstract:
A transform method according to the present invention comprises the steps of: obtaining transform coefficients for a target block; determining a non-separable secondary transform (NSST) set for the target block; selecting one of a plurality of NSST cores included in the NSST set based on an NSST index; and generating modified transform coefficients through a non-separable secondary transform of the transform coefficients based on the selected NSST core, wherein the NSST set for the target block is determined based on an intra prediction mode and/or the size of the target block. According to the present invention, the amount of transmitted data necessary for residual processing can be reduced, and residual coding efficiency can be increased.

Publication number: BR112019019702A2
Application number: R112019019702
Filing date: 2018-01-31
Publication date: 2020-04-14
Inventors: Jang Hyeongmoon; Lim Jaehyun; Kim Seunghwan
Applicant: LG Electronics Inc.
Main IPC class:
Patent description:
“TRANSFORM METHOD IN AN IMAGE CODING SYSTEM AND APPARATUS FOR THE SAME”

BACKGROUND OF THE INVENTION

Technical Field

[001] The present embodiment relates to image coding technology and, more particularly, to a transform method in an image coding system and an apparatus for the same.

Related Art

[002] The demand for high-resolution, high-quality images, such as HD (High Definition) and UHD (Ultra High Definition) images, has increased in various fields. Since image data has high resolution and high quality, the amount of information or bits to be transmitted increases compared to legacy image data. Therefore, when image data is transmitted using a medium such as a conventional wired or wireless broadband line, or when image data is stored using an existing storage medium, transmission cost and storage cost increase.

[003] Consequently, there is a need for a highly efficient image compression technique to effectively transmit, store, and reproduce high-resolution, high-quality image information.

SUMMARY

[004] The present embodiment provides a method and apparatus for improving the efficiency of image coding.

[005] The present embodiment also provides a method and apparatus for improving transform efficiency.

[006] The present embodiment also provides a method and apparatus for improving the efficiency of residual coding based on a multi-transform.

Petition 870190117153, of 11/13/2019, p. 8/53

[007] The present embodiment also provides a non-separable secondary transform method and apparatus.

[008] In one aspect, a transform method performed by a decoding apparatus is provided.
The method includes obtaining transform coefficients for a target block, determining a non-separable secondary transform (NSST) set for the target block, selecting one of a plurality of NSST cores included in the NSST set based on an NSST index, and generating modified transform coefficients through a non-separable secondary transform of the transform coefficients based on the selected NSST core. The NSST set for the target block is determined based on at least one of an intra prediction mode and a size of the target block.

[009] In accordance with another embodiment of the present invention, a decoding apparatus performing a transform is provided. The decoding apparatus includes a dequantizer configured to obtain transform coefficients for a target block by dequantizing quantized transform coefficients of the target block, and an inverse transformer configured to determine a non-separable secondary transform (NSST) set for the target block, select one of a plurality of NSST cores included in the NSST set based on an NSST index, and generate modified transform coefficients through a non-separable secondary transform of the transform coefficients based on the selected NSST core. The inverse transformer determines the NSST set for the target block based on at least one of an intra prediction mode and a size of the target block.

[010] In accordance with yet another embodiment of the present invention, a transform method performed by an encoding apparatus is provided. The method includes obtaining transform coefficients for a target block, determining a non-separable secondary transform (NSST) set for the target block, selecting one of a plurality of NSST cores included in the NSST set, determining an NSST index, and generating modified transform coefficients through a non-separable secondary transform of the transform coefficients based on the selected NSST core.
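As a rough, non-normative illustration of the decoding-side steps described above, the sketch below applies a selected NSST core to a 4x4 coefficient block as a single matrix multiplication of the vectorized block, which is what makes the transform non-separable. The cores here are random orthogonal placeholders, and the set/index semantics are hypothetical; they are not the actual NSST cores or selection rules of the embodiment.

```python
import numpy as np

def apply_nsst(coeffs_4x4, nsst_set, nsst_index):
    """Apply one non-separable secondary transform core to a 4x4
    coefficient block. nsst_set is a list of 16x16 core matrices;
    nsst_index selects one of them (placeholder semantics)."""
    core = nsst_set[nsst_index]      # select one NSST core from the set
    vec = coeffs_4x4.reshape(16)     # vectorize the 4x4 block
    modified = core @ vec            # non-separable transform: one matrix multiply
    return modified.reshape(4, 4)

# Hypothetical NSST set: random orthogonal 16x16 cores for illustration only.
rng = np.random.default_rng(0)
nsst_pool = [np.linalg.qr(rng.normal(size=(16, 16)))[0] for _ in range(3)]

block = rng.normal(size=(4, 4))
out = apply_nsst(block, nsst_pool, nsst_index=1)
```

Because each placeholder core is orthogonal, the matching inverse operation is simply multiplication by the transpose of the same core, which recovers the input coefficients.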
The NSST set for the target block is determined based on at least one of an intra prediction mode and the size of the target block.

[011] In accordance with yet another embodiment of the present invention, an encoding apparatus performing a transform is provided. The encoding apparatus includes a transformer configured to obtain transform coefficients for a target block by performing a primary transform on residual samples of the target block, determine a non-separable secondary transform (NSST) set for the target block, select one of a plurality of NSST cores included in the NSST set based on an NSST index, and generate modified transform coefficients through a non-separable secondary transform of the transform coefficients based on the selected NSST core. The transformer determines the NSST set for the target block based on at least one of an intra prediction mode and the size of the target block.

[012] According to the present embodiment, overall image/video compression efficiency can be improved.

[013] According to the present embodiment, the amount of data required for residual processing can be reduced, and residual coding efficiency can be improved through an efficient transform.

[014] According to the present embodiment, non-zero transform coefficients can be concentrated in a low-frequency component of the frequency domain through a secondary transform.

[015] According to the present embodiment, transform efficiency can be improved by applying a transform core in a variable/adaptive manner when performing a non-separable secondary transform.

BRIEF DESCRIPTION OF THE DRAWINGS

[016] FIG. 1 is a schematic diagram illustrating a configuration of a video encoding apparatus to which the present embodiment is applicable.

[017] FIG. 2 is a schematic diagram illustrating a configuration of a video decoding apparatus to which the present embodiment is applicable.

[018] FIG.
3 schematically illustrates a multi-transform scheme according to the present embodiment.

[019] FIG. 4 illustrates 65 directional modes among the intra prediction modes.

[020] FIG. 5 illustrates a method of determining an NSST set based on an intra prediction mode and a block size.

[021] FIG. 6 schematically illustrates an example of a video/image encoding method including a transform method according to the present embodiment.

[022] FIG. 7 schematically illustrates an example of a video/image decoding method including a transform method according to the present embodiment.

DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS

[023] The present embodiment may be modified in various ways, and specific examples thereof will be described and illustrated in the drawings. However, the examples are not intended to limit the embodiment. The terms used in the following description are used merely to describe specific examples and are not intended to be limiting. An expression in the singular includes the meaning of the plural unless it clearly reads otherwise. Terms such as "include" and "have" are intended to indicate that the features, numbers, steps, operations, elements, components, or combinations thereof used in the following description exist, and it should be understood that the possibility of the existence or addition of one or more other features, numbers, steps, operations, elements, components, or combinations thereof is not excluded.

[024] Meanwhile, the elements in the drawings described in the embodiment are illustrated independently for the convenience of explaining their different specific functions, and this does not mean that each element is implemented as separate hardware or separate software. For example, two or more of the elements may be combined to form a single element, or one element may be divided into a plurality of elements.
Embodiments in which the elements are combined and/or divided also belong to the present embodiment without departing from its concept.

[025] Hereinafter, examples of the present embodiment will be described in detail with reference to the accompanying drawings. In addition, like reference numerals are used to indicate like elements throughout the drawings, and the same descriptions of like elements will be omitted.

[026] In this specification, a picture generally means a unit representing one image at a specific time, and a slice is a unit constituting a part of a picture. One picture may be composed of a plurality of slices, and the terms picture and slice may be used interchangeably as the occasion demands.

[027] A pixel or "pel" (picture element) may represent a minimum unit constituting one picture. In addition, the term "sample" may be used as a term corresponding to a pixel. A sample may generally represent a pixel or a pixel value, may represent only a pixel (pixel value) of a luma component, or may represent only a pixel (pixel value) of a chroma component.

[028] A unit indicates a basic unit of image processing. A unit may include at least one of a specific area and information related to the area. Optionally, the term unit may be used interchangeably with terms such as block, area, or the like. In a typical case, an MxN block may represent a set of samples or transform coefficients composed of M columns and N rows.

[029] FIG. 1 briefly illustrates a structure of a video encoding apparatus to which the present embodiment is applicable.

[030] Referring to FIG. 1, a video encoding apparatus 100 may include an image partitioner 105, a predictor 110, a residual processor 120, an entropy encoder 130, an adder 140, a filter 150, and a memory 160. The residual processor 120 may include a subtractor 121, a transformer 122, a quantizer 123, a reorderer 124, a dequantizer 125, and an inverse transformer 126.
[031] The image partitioner 105 may divide an input image into at least one processing unit.

[032] In an example, the processing unit may be referred to as a coding unit (CU). In this case, the coding unit may be recursively split from the largest coding unit (LCU) according to a quad-tree binary-tree (QTBT) structure. For example, one coding unit may be split into a plurality of coding units of deeper depth based on a quad-tree structure and/or a binary-tree structure. In this case, for example, the quad-tree structure may be applied first and the binary-tree structure may be applied later. Alternatively, the binary-tree structure may be applied first. The coding procedure according to the present embodiment may be performed based on a final coding unit that is no longer split. In this case, the largest coding unit may be used directly as the final coding unit based on coding efficiency according to image characteristics, or, as needed, the coding unit may be recursively split into coding units of deeper depth so that a coding unit having an optimal size is used as the final coding unit. Here, the coding procedure may include procedures such as prediction, transform, and reconstruction, which will be described later.

[033] In another example, the processing unit may include a coding unit (CU), a prediction unit (PU), or a transform unit (TU). The coding unit may be split from the largest coding unit (LCU) into coding units of deeper depth according to the quad-tree structure. In this case as well, the largest coding unit may be used directly as the final coding unit based on coding efficiency according to image characteristics, or, as needed, the coding unit may be recursively split into coding units of deeper depth so that a coding unit having an optimal size is used as the final coding unit.
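The recursive splitting just described can be sketched briefly. The toy code below implements only the quad-tree half of the QTBT structure; the split decision function is a hypothetical stand-in for the encoder's actual rate-distortion decision, not the embodiment's rule.

```python
def quad_split(x, y, w, h):
    """Split one block into four equally sized sub-blocks (quad-tree split)."""
    hw, hh = w // 2, h // 2
    return [(x, y, hw, hh), (x + hw, y, hw, hh),
            (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]

def partition(x, y, w, h, keep_splitting):
    """Recursively split an LCU down to final coding units.
    keep_splitting(w, h) is a hypothetical stand-in for the RD decision."""
    if not keep_splitting(w, h):
        return [(x, y, w, h)]          # a final coding unit, no longer split
    leaves = []
    for sub in quad_split(x, y, w, h):
        leaves.extend(partition(*sub, keep_splitting))
    return leaves

# Hypothetical decision: split any block larger than 16x16.
cus = partition(0, 0, 64, 64, lambda w, h: w > 16)
```

With this decision rule a 64x64 LCU yields sixteen 16x16 final coding units that tile the LCU exactly; a binary-tree stage would additionally allow rectangular (2:1) splits at the leaves.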
When the smallest coding unit (SCU) is defined, the coding unit cannot be split into coding units smaller than the smallest coding unit. Here, the final coding unit refers to a coding unit that serves as a basis for partitioning or splitting into prediction units or transform units. A prediction unit is a unit partitioned from a coding unit and may be a unit of sample prediction. Here, the prediction unit may be divided into sub-blocks. The transform unit may be split from the coding unit according to the quad-tree structure, and may be a unit for deriving a transform coefficient and/or a unit for deriving a residual signal from a transform coefficient. Hereinafter, the coding unit may be referred to as a coding block (CB), the prediction unit may be referred to as a prediction block (PB), and the transform unit may be referred to as a transform block (TB). The prediction block or prediction unit may refer to a specific area in the form of a block within a picture and include an array of prediction samples. In addition, the transform block or transform unit may refer to a specific area in the form of a block within a picture and include a transform coefficient or an array of residual samples.

[034] The predictor 110 may perform prediction on a processing target block (hereinafter, a current block), and may generate a predicted block including prediction samples for the current block. The unit of prediction performed in the predictor 110 may be a coding block, a transform block, or a prediction block.

[035] The predictor 110 may determine whether intra prediction or inter prediction is applied to the current block. For example, the predictor 110 may determine, in units of a CU, whether intra prediction or inter prediction is applied.
[036] In the case of intra prediction, the predictor 110 may derive a prediction sample for the current block based on reference samples outside the current block within the picture to which the current block belongs (hereinafter, the current picture). In this case, the predictor 110 may derive the prediction sample based on an average or interpolation of neighboring reference samples of the current block (case (i)), or may derive the prediction sample based on a reference sample existing in a specific (prediction) direction with respect to the prediction sample among the neighboring reference samples of the current block (case (ii)). Case (i) may be called a non-directional or non-angular mode, and case (ii) may be called a directional or angular mode. In intra prediction, the prediction modes may include, as an example, 33 directional modes and at least two non-directional modes. The non-directional modes may include a DC mode and a planar mode. The predictor 110 may determine the prediction mode to be applied to the current block using the prediction mode applied to a neighboring block.

[037] In the case of inter prediction, the predictor 110 may derive the prediction sample for the current block based on a sample specified by a motion vector in a reference picture. The predictor 110 may derive the prediction sample for the current block by applying one of a skip mode, a merge mode, and a motion vector prediction (MVP) mode. In the skip mode and the merge mode, the predictor 110 may use motion information of a neighboring block as motion information of the current block. In the skip mode, unlike the merge mode, the difference (residual) between the prediction sample and the original sample is not transmitted. In the MVP mode, a motion vector of a neighboring block is used as a motion vector predictor of the current block to derive the motion vector of the current block.
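Case (i) above, the non-directional averaging mode, admits a very small sketch. The code below is an illustrative DC-style prediction assuming hypothetical reference sample values; actual codecs add rounding, filtering, and unavailability handling that are omitted here.

```python
import numpy as np

def dc_intra_prediction(top_refs, left_refs, size):
    """Case (i): predict every sample of the current block as the average
    of the neighboring reference samples (illustrative DC-style mode)."""
    dc = (np.sum(top_refs) + np.sum(left_refs)) / (len(top_refs) + len(left_refs))
    return np.full((size, size), dc)

# Hypothetical reference samples above and to the left of a 4x4 block.
top = np.array([100.0, 102.0, 104.0, 106.0])
left = np.array([98.0, 96.0, 94.0, 92.0])
pred = dc_intra_prediction(top, left, 4)
```

A directional (case (ii)) mode would instead copy or interpolate the reference samples along one of the angular directions rather than averaging them all.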
[038] In the case of inter prediction, the neighboring block may include a spatial neighboring block existing in the current picture and a temporal neighboring block existing in the reference picture. The reference picture including the temporal neighboring block may also be called a collocated picture (colPic). Motion information may include the motion vector and a reference picture index. Information such as prediction mode information and motion information may be (entropy-)encoded and then output in the form of a bitstream.

[039] When motion information of a temporal neighboring block is used in the skip mode and the merge mode, the highest picture in a reference picture list may be used as the reference picture. Reference pictures included in the reference picture list may be aligned based on a picture order count (POC) difference between the current picture and the corresponding reference picture. A POC corresponds to the display order of pictures and may be distinguished from the coding order.

[040] The subtractor 121 generates a residual sample, which is a difference between an original sample and a prediction sample. When the skip mode is applied, the residual sample may not be generated as described above.

[041] The transformer 122 transforms residual samples in units of a transform block to generate a transform coefficient. The transformer 122 may perform the transform based on the size of the corresponding transform block and the prediction mode applied to the coding block or prediction block spatially overlapping the transform block. For example, the residual samples may be transformed using a discrete sine transform (DST) core if intra prediction is applied to the coding block or prediction block overlapping the transform block and the transform block is a 4x4 residual array, and may be transformed using a discrete cosine transform (DCT) core in other cases.

[042] The quantizer 123 may quantize the transform coefficients to generate quantized transform coefficients.
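The quantization step in [042] and its inverse in the dequantizer can be illustrated with a simple uniform quantizer. This is a hedged stand-in: the actual codec derives the step size from a quantization parameter and applies additional scaling, which is omitted here.

```python
import numpy as np

def quantize(coeffs, qstep):
    """Quantizer 123 (illustrative): map transform coefficients to small
    integer levels by uniform quantization."""
    return np.round(coeffs / qstep).astype(int)

def dequantize(levels, qstep):
    """Dequantizer 125 (illustrative): reconstruct approximate coefficients."""
    return levels * qstep

coeffs = np.array([57.3, -12.8, 3.9, 0.4])
levels = quantize(coeffs, qstep=4.0)    # small integers suited to entropy coding
recon = dequantize(levels, qstep=4.0)
```

Quantization is the lossy step: the reconstructed coefficients differ from the originals by at most half a quantization step per coefficient, and small coefficients collapse to zero, which is what makes the subsequent entropy coding efficient.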
[043] The reorderer 124 reorders the quantized transform coefficients. The reorderer 124 may reorder the block-form quantized transform coefficients into a one-dimensional vector using a coefficient scanning method. Although the reorderer 124 is described as a separate component, the reorderer 124 may be a part of the quantizer 123.

[044] The entropy encoder 130 may perform entropy encoding on the quantized transform coefficients. Entropy encoding may include encoding methods such as, for example, exponential Golomb, context-adaptive variable length coding (CAVLC), and context-adaptive binary arithmetic coding (CABAC). The entropy encoder 130 may encode, together or separately, by entropy encoding or by a predetermined method, information (e.g., values of syntax elements and the like) required for video reconstruction in addition to the quantized transform coefficients. Entropy-encoded information may be transmitted or stored in units of a network abstraction layer (NAL) in the form of a bitstream.

[045] The dequantizer 125 dequantizes the values (transform coefficients) quantized by the quantizer 123, and the inverse transformer 126 inversely transforms the values dequantized by the dequantizer 125 to generate a residual sample.

[046] The adder 140 adds a residual sample to a prediction sample to reconstruct a picture. The residual sample may be added to the prediction sample in units of a block to generate a reconstructed block. Although the adder 140 is described as a separate component, the adder 140 may be a part of the predictor 110. Meanwhile, the adder 140 may be referred to as a reconstructor or a reconstructed block generator.

[047] The filter 150 may apply a deblocking filter and/or a sample adaptive offset to the reconstructed picture. Artifacts at block boundaries in the reconstructed picture or distortion from quantization can be corrected through the deblocking filter and/or the sample adaptive offset.
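The coefficient scanning performed by the reorderer in [043] (and inverted on the decoder side) can be sketched as follows. The anti-diagonal scan used here is illustrative only; the actual scan orders are defined by the codec specification.

```python
import numpy as np

def diagonal_scan_order(n):
    """Anti-diagonal scan positions for an n x n block (illustrative order)."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1], rc[0]))

def reorder_to_1d(block):
    """Encoder-side reorderer: 2-D quantized coefficients -> 1-D vector."""
    return np.array([block[r, c] for r, c in diagonal_scan_order(block.shape[0])])

def reorder_to_2d(vec, n):
    """Decoder-side reorderer: invert the scan back into a 2-D block."""
    out = np.zeros((n, n), dtype=vec.dtype)
    for value, (r, c) in zip(vec, diagonal_scan_order(n)):
        out[r, c] = value
    return out

quantized = np.arange(16).reshape(4, 4)
scanned = reorder_to_1d(quantized)
```

Scanning low-frequency (top-left) positions first tends to place the non-zero levels at the start of the vector, with trailing zeros that entropy coding represents cheaply; the decoder-side reorderer restores the two-dimensional block exactly.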
The sample adaptive offset may be applied in units of a sample after the deblocking filtering is completed. The filter 150 may apply an adaptive loop filter (ALF) to the reconstructed picture. The ALF may be applied to the reconstructed picture to which the deblocking filter and/or the sample adaptive offset has been applied.

[048] The memory 160 may store a reconstructed picture (decoded picture) or information required for encoding/decoding. Here, the reconstructed picture may be the reconstructed picture filtered by the filter 150. The stored reconstructed picture may be used as a reference picture for (inter) prediction of other pictures. For example, the memory 160 may store (reference) pictures used for inter prediction. Here, the pictures used for inter prediction may be designated according to a reference picture set or a reference picture list.

[049] FIG. 2 briefly illustrates a structure of a video decoding apparatus to which the present embodiment is applicable.

[050] Referring to FIG. 2, a video decoding apparatus 200 may include an entropy decoder 210, a residual processor 220, a predictor 230, an adder 240, a filter 250, and a memory 260. The residual processor 220 may include a reorderer 221, a dequantizer 222, and an inverse transformer 223. Although not illustrated in the drawings, the video decoding apparatus 200 may include a receiver for receiving a bitstream including video information. The receiver may be configured as a separate module or may be included in the entropy decoder 210.

[051] When a bitstream including video information is input, the video decoding apparatus 200 may reconstruct a video in association with the process by which the video information was processed in the video encoding apparatus.

[052] For example, the video decoding apparatus 200 may perform video decoding using a processing unit applied in the video encoding apparatus.
Thus, the processing unit block of video decoding may be, for example, a coding unit or, in another example, a coding unit, a prediction unit, or a transform unit. The coding unit may be split from the largest coding unit according to the quad-tree structure and/or the binary-tree structure.

[053] A prediction unit and a transform unit may be further used in some cases, and in this case the prediction block is a block derived or partitioned from the coding unit and may be a unit of sample prediction. Here, the prediction unit may be divided into sub-blocks. The transform unit may be split from the coding unit according to the quad-tree structure, and may be a unit for deriving a transform coefficient and/or a unit for deriving a residual signal from a transform coefficient.

[054] The entropy decoder 210 may parse the bitstream to generate information required for video reconstruction or picture reconstruction. For example, the entropy decoder 210 may decode information in the bitstream based on a coding method such as exponential Golomb coding, CAVLC, or CABAC, and may output a value of a syntax element required for video reconstruction and quantized values of transform coefficients with respect to a residual.

[055] More specifically, a CABAC entropy decoding method may receive a bin corresponding to each syntax element in the bitstream, determine a context model using decoding target syntax element information, decoding information of neighboring blocks and the decoding target block, or information of a symbol/bin decoded in a previous step, predict the probability of occurrence of a bin according to the determined context model, and perform arithmetic decoding of the bin to generate a symbol corresponding to the value of each syntax element. Here, the CABAC entropy decoding method may update the
context model using the information of the decoded symbol/bin for the context model of the next symbol/bin after determining the context model.

[056] Information about prediction among the information decoded by the entropy decoder 210 may be provided to the predictor 230, and residual values, that is, quantized transform coefficients, on which entropy decoding has been performed by the entropy decoder 210, may be input to the reorderer 221.

[057] The reorderer 221 may rearrange the quantized transform coefficients into a two-dimensional block form. The reorderer 221 may perform reordering corresponding to the coefficient scanning performed by the encoding apparatus. Although the reorderer 221 is described as a separate component, the reorderer 221 may be a part of the dequantizer 222.

[058] The dequantizer 222 may dequantize the quantized transform coefficients based on a (de)quantization parameter to output a transform coefficient. In this case, information for deriving the quantization parameter may be signaled from the encoding apparatus.

[059] The inverse transformer 223 may inversely transform the transform coefficients to derive residual samples.

[060] The predictor 230 may perform prediction on a current block, and may generate a predicted block including prediction samples for the current block. The unit of prediction performed in the predictor 230 may be a coding block, a transform block, or a prediction block.

[061] The predictor 230 may determine whether to apply intra prediction or inter prediction based on the information about prediction. In this case, the unit for determining which of intra prediction and inter prediction is to be applied may be different from the unit for generating a prediction sample. In addition, the unit for generating a prediction sample may also differ between inter prediction and intra prediction. For example, which of inter prediction and intra prediction is to be applied may be determined in units of a CU.
Additionally, for example, in inter prediction, the prediction sample may be generated by determining the prediction mode in units of a PU, while in intra prediction, the prediction mode may be determined in units of a PU and the prediction sample may be generated in units of a TU.

[062] In the case of intra prediction, the predictor 230 may derive a prediction sample for the current block based on neighboring reference samples in the current picture. The predictor 230 may derive the prediction sample for the current block by applying a directional mode or a non-directional mode based on the neighboring reference samples of the current block. In this case, the prediction mode to be applied to the current block may be determined using the intra prediction mode of a neighboring block.

[063] In the case of inter prediction, the predictor 230 may derive a prediction sample for the current block based on a sample specified in a reference picture according to a motion vector. The predictor 230 may derive the prediction sample for the current block using one of the skip mode, the merge mode, and the MVP mode. Here, motion information required for the inter prediction of the current block provided by the encoding apparatus, for example, a motion vector and information about a reference picture index, may be obtained or derived based on the information about prediction.

[064] In the skip mode and the merge mode, motion information of a neighboring block may be used as motion information of the current block. Here, the neighboring block may include a spatial neighboring block and a temporal neighboring block.

[065] The predictor 230 may construct a merge candidate list using motion information of the neighboring blocks, and use the information indicated by a merge index in the merge candidate list as the motion vector of the current block. The merge index may be signaled by the encoding apparatus.
The motion information may include a motion vector and a reference picture. When motion information of a temporal neighboring block is used in the skip mode and the merge mode, the highest picture in the reference picture list may be used as the reference picture.

[066] In the case of the skip mode, unlike the merge mode, the difference (residual) between the prediction sample and the original sample is not transmitted.

[067] In the case of the MVP mode, the motion vector of the current block may be derived using a motion vector of a neighboring block as a motion vector predictor. Here, the neighboring block may include a spatial neighboring block and a temporal neighboring block.

[068] When the merge mode is applied, for example, a merge candidate list may be generated using a motion vector of a reconstructed spatial neighboring block and/or a motion vector corresponding to a Col block, which is a temporal neighboring block. In the merge mode, the motion vector of a candidate block selected from the merge candidate list is used as the motion vector of the current block. The aforementioned information about prediction may include a merge index indicating the candidate block having the optimal motion vector selected from the candidate blocks included in the merge candidate list. Here, the predictor 230 may derive the motion vector of the current block using the merge index.

[069] When the MVP (Motion Vector Prediction) mode is applied, in another example, a motion vector predictor candidate list may be generated using a motion vector of a reconstructed spatial neighboring block and/or a motion vector corresponding to a Col block, which is a temporal neighboring block. That is, the motion vector of the reconstructed spatial neighboring block and/or the motion vector corresponding to the Col block, which is the temporal neighboring block, may be used as motion vector candidates.
The aforementioned information about prediction may include a motion vector predictor index indicating the optimal motion vector selected from the motion vector candidates included in the list. Here, the predictor 230 may select the motion vector predictor of the current block from the motion vector candidates included in the motion vector candidate list using the motion vector predictor index. The predictor of the encoding apparatus may obtain a motion vector difference (MVD) between the motion vector of the current block and the motion vector predictor, encode the MVD, and output the MVD in the form of a bitstream. That is, the MVD may be obtained by subtracting the motion vector predictor from the motion vector of the current block. Here, the predictor 230 may obtain the motion vector difference included in the information about prediction, and derive the motion vector of the current block by adding the motion vector difference to the motion vector predictor. In addition, the predictor may obtain or derive a reference picture index indicating a reference picture from the aforementioned information about prediction.

[070] The adder 240 may add a residual sample to a prediction sample to reconstruct a current block or a current picture. The adder 240 may reconstruct the current picture by adding the residual sample to the prediction sample in units of a block. When the skip mode is applied, a residual is not transmitted, and thus the prediction sample may become a reconstructed sample. Although the adder 240 is described as a separate component, the adder 240 may be a part of the predictor 230. Meanwhile, the adder 240 may be referred to as a reconstructor or a reconstructed block generator.

[071] The filter 250 may apply a deblocking filter, a sample adaptive offset, and/or an ALF to the reconstructed picture. Here, the sample adaptive offset may be applied in units of a sample after the deblocking filtering.
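The MVD relationship described above is simple arithmetic: the encoder transmits only the difference between the actual motion vector and the predictor, and the decoder adds it back. A minimal sketch with hypothetical vector values:

```python
def encode_mvd(mv, mvp):
    """Encoder side: MVD = motion vector minus motion vector predictor."""
    return (mv[0] - mvp[0], mv[1] - mvp[1])

def decode_mv(mvp, mvd):
    """Decoder side: motion vector = motion vector predictor plus MVD."""
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])

# Hypothetical values: the neighboring block's motion vector is the predictor.
mv, mvp = (5, -3), (4, -1)
mvd = encode_mvd(mv, mvp)
```

Since the predictor is derived identically on both sides from the candidate list, signaling the small difference (1, -2) costs fewer bits than signaling (5, -3) directly.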
The ALF may be applied after the deblocking filtering and/or the application of the sample adaptive offset.

[072] The memory 260 may store a reconstructed picture (decoded picture) or information required for decoding. Here, the reconstructed picture may be the reconstructed picture filtered by the filter 250. For example, the memory 260 may store pictures used for inter prediction. Here, the pictures used for inter prediction may be designated according to a reference picture set or a reference picture list. A reconstructed picture may be used as a reference picture for other pictures. The memory 260 may output reconstructed pictures in an output order.

[073] Meanwhile, as described above, in performing video coding, prediction is performed to improve compression efficiency. Through this, a predicted block including prediction samples for a current block, that is, a coding target block, can be generated. In this case, the predicted block includes prediction samples in a spatial domain (or pixel domain). The predicted block is derived identically in the encoding apparatus and the decoding apparatus. The encoding apparatus can improve coding efficiency by signaling, to the decoding apparatus, not the original sample values of the original block themselves but information about the residual between the original block and the predicted block (residual information). The decoding apparatus can derive a residual block including residual samples based on the residual information, generate a reconstructed block including reconstructed samples by adding the residual block and the predicted block, and generate a reconstructed picture including the reconstructed blocks.

[074] The residual information can be generated through a transform and quantization procedure.
For example, the encoding apparatus can derive the residual block between the original block and the predicted block, can derive transform coefficients by performing a transform procedure on the residual samples (residual sample array) included in the residual block, can derive quantized transform coefficients by performing a quantization procedure on the transform coefficients, and can signal the related residual information to the decoding apparatus (through a bitstream). In this case, the residual information can include information such as value information, location information, a transform scheme, a transform core and a quantization parameter for the quantized transform coefficients. The decoding apparatus can perform a dequantization / inverse transform procedure based on the residual information, and can derive the residual samples (or residual block). The decoding apparatus can generate the reconstruction image based on the predicted block and the residual block. The encoding apparatus can also derive the residual block by performing a dequantization / inverse transform on the quantized transform coefficients, for reference for the inter prediction of a subsequent image, and can generate the reconstruction image based on the residual block. [075] Meanwhile, according to the present embodiment, a multi-transform scheme can be applied when the transform described above is performed. [076] FIG. 3 schematically illustrates a multi-transform scheme according to the present embodiment. [077] Referring to FIG. 3, the transformer can correspond to the transformer of the encoding apparatus of FIG. 1. The inverse transformer can correspond to the inverse transformer of the encoding apparatus of FIG. 1 or to the inverse transformer of the decoding apparatus of FIG. 2. [078] The transformer can derive (primary) transform coefficients by performing a primary transform based on the residual samples (residual sample array) within a residual block (S310).
In this case, the primary transform can include an adaptive multi-core transform. [079] The adaptive multi-core transform can indicate a method of performing a transform additionally using a discrete cosine transform (DCT) Type 2, a discrete sine transform (DST) Type 7, a DCT Type 8 and/or a DST Type 1. That is, the multi-core transform can indicate a transform method that transforms a residual signal (or residual block) of a spatial domain into transform coefficients (or primary transform coefficients) of the frequency domain based on a plurality of transform cores selected from among DCT Type 2, DST Type 7, DCT Type 8 and DST Type 1. In this case, the primary transform coefficients can be called temporary transform coefficients from the point of view of the transformer. [080] In other words, if the existing transform method is applied, the transform coefficients can be generated by applying a transform from a spatial domain to a frequency domain to a residual signal (or residual block) based on DCT Type 2. In contrast, if the adaptive multi-core transform is applied, the transform coefficients (or primary transform coefficients) can be generated by applying a transform from a spatial domain to a frequency domain to a residual signal (or residual block) based on DCT Type 2, DST Type 7, DCT Type 8 and/or DST Type 1. In this case, DCT Type 2, DST Type 7, DCT Type 8 and DST Type 1 can be called a transform type or a transform core. [081] For reference, the DCT/DST transform types can be defined based on basis functions. The basis functions can be represented as follows.

[Table 1]

Transform type | Basis function Ti(j), i, j = 0, 1, ..., N−1
DCT-II   | Ti(j) = ω0 · √(2/N) · cos( π·i·(2j+1) / (2N) ), where ω0 = √(1/2) if i = 0, and ω0 = 1 otherwise
DCT-V    | Ti(j) = ω0 · ω1 · √(2/(2N−1)) · cos( 2π·i·j / (2N−1) ), where ω0 = √(1/2) if i = 0 (otherwise 1), and ω1 = √(1/2) if j = 0 (otherwise 1)
DCT-VIII | Ti(j) = √(4/(2N+1)) · cos( π·(2i+1)·(2j+1) / (4N+2) )
DST-I    | Ti(j) = √(2/(N+1)) · sin( π·(i+1)·(j+1) / (N+1) )
DST-VII  | Ti(j) = √(4/(2N+1)) · sin( π·(2i+1)·(j+1) / (2N+1) )

[082] If the adaptive multi-core transform is performed, a vertical transform core and a horizontal transform core for a target block can be selected from among the transform cores. A vertical transform for the target block can be performed based on the vertical transform core, and a horizontal transform for the target block can be performed based on the horizontal transform core. In this case, the horizontal transform can indicate a transform for the horizontal components of the target block, and the vertical transform can indicate a transform for the vertical components of the target block. The vertical transform core / horizontal transform core can be adaptively determined based on a prediction mode of the target block (CU or sub-block) comprising the residual block and/or a transform index indicating a transform subset. [083] The transformer can derive (secondary) transform coefficients by performing a secondary transform based on the (primary) transform coefficients (S320). If the primary transform is a transform from the spatial domain to the frequency domain, the secondary transform can be considered a transform from the frequency domain to the frequency domain. The secondary transform can include a non-separable transform. In this case, the secondary transform can be called a non-separable secondary transform (NSST). The non-separable secondary transform can indicate a transform that generates transform coefficients (or secondary transform coefficients) for a residual signal by performing a secondary transform on the (primary) transform coefficients, derived through the primary transform, based on a non-separable transform matrix.
In this case, the vertical transform and the horizontal transform are not applied separately (or independently); instead, the transform is applied to the (primary) transform coefficients at once based on the non-separable transform matrix. In other words, the non-separable secondary transform can indicate a transform method that generates transform coefficients (or secondary transform coefficients) by performing the transform without separating the vertical components and the horizontal components of the (primary) transform coefficients, based on a non-separable transform matrix. The non-separable secondary transform can be applied to the upper-left region of a block (hereinafter referred to as the transform coefficient block) configured with the (primary) transform coefficients. For example, if each of the width (W) and height (H) of the transform coefficient block is 8 or more, an 8x8 non-separable secondary transform can be applied to the upper-left 8x8 region of the transform coefficient block. Additionally, if each of the width (W) and height (H) of the transform coefficient block is less than 8, a 4x4 non-separable secondary transform can be applied to the upper-left min(8, W) x min(8, H) region of the transform coefficient block. [084] Specifically, for example, if a 4x4 input block is used, the non-separable secondary transform can be performed as follows. [085] The 4x4 input block X can be represented as follows.

[Equation 1]

X = | x00 x01 x02 x03 |
    | x10 x11 x12 x13 |
    | x20 x21 x22 x23 |
    | x30 x31 x32 x33 |

[086] If X is represented in a vector form, the vector X can be represented as follows.

[Equation 2]

X = [ x00 x01 x02 x03 x10 x11 x12 x13 x20 x21 x22 x23 x30 x31 x32 x33 ]^T

[087] In this case, the non-separable secondary transform can be calculated as follows.
[Equation 3]

F = T · X

[088] In this case, F indicates a 16x1 transform coefficient vector, and T indicates a 16x16 (non-separable) transform matrix. [089] The 16x1 transform coefficient vector F can be derived by Equation 3, and F can be rearranged into a 4x4 block through a scan order (horizontal, vertical or diagonal). Meanwhile, in the above calculation, for example, a Hypercube-Givens Transform (HyGT) can be used for the calculation of the non-separable secondary transform in order to reduce its computational load. [090] Meanwhile, in the non-separable secondary transform, a transform core (or transform type) can be selected in a mode-dependent way. In this case, the mode can include an intra prediction mode and/or an inter prediction mode. [091] As described above, the non-separable secondary transform can be performed based on an 8x8 transform or a 4x4 transform determined based on the width (W) and height (H) of the transform coefficient block. That is, the non-separable secondary transform can be performed based on an 8x8 sub-block size or a 4x4 sub-block size. For example, in order to select a mode-based transform core, 35 sets of non-separable secondary transform cores, each set having three cores, can be configured for the non-separable secondary transform, for both the 8x8 sub-block size and the 4x4 sub-block size. That is, 35 transform sets can be configured for the 8x8 sub-block size, and 35 transform sets can be configured for the 4x4 sub-block size. In this case, three 8x8 transform cores can be included in each of the 35 transform sets for the 8x8 sub-block size, and three 4x4 transform cores can be included in each of the 35 transform sets for the 4x4 sub-block size.
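As a minimal sketch of Equations 1 to 3 (with a hypothetical 16x16 kernel matrix T; the actual NSST kernel matrices are defined per NSST set and per NSST index), the 4x4 block is flattened into a 16x1 vector, multiplied by T in a single matrix-vector product, and the resulting vector F is rearranged by a scan order:

```python
def nsst_4x4(block, T):
    """4x4 non-separable secondary transform per Equations 1-3.

    block: 4x4 list of lists with primary transform coefficients (X).
    T:     16x16 non-separable transform matrix (hypothetical kernel).
    Returns the secondary coefficients rearranged into a 4x4 block
    (horizontal scan order).
    """
    # Equation 2: flatten X row by row into a 16-element vector.
    x = [block[i][j] for i in range(4) for j in range(4)]
    # Equation 3: F = T * X (one product, no separable vertical/horizontal passes).
    f = [sum(T[r][c] * x[c] for c in range(16)) for r in range(16)]
    # Rearrange the 16x1 vector F into a 4x4 block.
    return [f[4 * i:4 * i + 4] for i in range(4)]

# With the identity kernel the transform leaves the block unchanged.
identity = [[1 if r == c else 0 for c in range(16)] for r in range(16)]
X = [[4 * i + j for j in range(4)] for i in range(4)]
assert nsst_4x4(X, identity) == X
```

The point of the sketch is that, unlike the primary transform, no separate horizontal and vertical passes exist: all 16 coefficients are mixed at once.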
However, the transform sub-block size, the number of sets and the number of cores within a set are only examples: a size other than 8x8 or 4x4 can be used, n sets can be configured, and k transform cores can be included in each set. [092] The transform set can be called an NSST set, and a transform core within the NSST set can be called an NSST core. The selection of a specific set from among the transform sets can be performed based on the intra prediction mode of the target block (CU or sub-block), for example. [093] For reference, an intra prediction mode can be, for example, one of two non-directional (or non-angular) intra prediction modes and 65 directional (or angular) intra prediction modes. The non-directional intra prediction modes can include the No. 0 intra prediction mode (planar) and the No. 1 intra prediction mode (DC). The directional intra prediction modes can include the sixty-five intra prediction modes Nos. 2 to 66. However, these are examples, and the present embodiment can also be applied to a case where the number of intra prediction modes is different. Meanwhile, in some cases, a No. 67 intra prediction mode can additionally be used. The No. 67 intra prediction mode can indicate a linear model (LM) mode. [094] FIG. 4 illustrates the 65 directional intra prediction modes. [095] Referring to FIG. 4, the modes can be divided into intra prediction modes having horizontal directionality and intra prediction modes having vertical directionality based on the No. 34 intra prediction mode, which has an up-left diagonal prediction direction. In FIG. 4, H and V signify horizontal directionality and vertical directionality, respectively, and the numbers −32 to 32 indicate a displacement of 1/32 units at a sample grid position. The intra prediction modes Nos. 2 to 33 have horizontal directionality, and the intra prediction modes Nos. 34 to 66 have vertical directionality.
The No. 18 intra prediction mode and the No. 50 intra prediction mode indicate the intra horizontal prediction mode and the intra vertical prediction mode, respectively. The No. 2 intra prediction mode can be called the down-left intra diagonal prediction mode, the No. 34 intra prediction mode can be called the up-left intra diagonal prediction mode, and the No. 66 intra prediction mode can be called the up-right intra diagonal prediction mode. [096] In this case, the mapping between the 35 transform sets and the intra prediction modes can be indicated as in the following table, for example. For reference, if an LM mode is applied to a target block, the secondary transform may not be applied to the target block.

[Table 2]

Intra mode: 0  1  2  3  ...  31 32 33
Set:        0  1  2  3  ...  31 32 33

Intra mode: 34 35 36 37 ...  64 65 66  67 (LM)
Set:        34 33 32 31 ...  4  3  2   NULL

[097] Meanwhile, if a specific set is determined to be used, one of the three transform cores within the specific set can be selected based on an NSST index. The encoding apparatus can derive the NSST index indicating a specific transform core based on a rate-distortion (RD) check. The NSST index can be signaled to the decoding apparatus, and the decoding apparatus can select one of the three transform cores within the specific set based on the NSST index. For example, an NSST index value of 0 can indicate a first NSST core, an NSST index value of 1 can indicate a second NSST core, and an NSST index value of 2 can indicate a third NSST core. Alternatively, an NSST index value of 0 may indicate that an NSST is not applied to the target block.
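The mode-to-set mapping of Table 2 can be sketched as follows (returning None for the NULL entry of the LM mode, where no secondary transform is applied):

```python
def nsst_set_index(intra_mode):
    """Map an intra prediction mode (0..67) to one of the 35 NSST sets
    per Table 2. Returns None (NULL) for the LM mode (No. 67)."""
    if intra_mode == 67:          # LM mode: no NSST set
        return None
    if intra_mode <= 33:          # modes 0..33 map directly to sets 0..33
        return intra_mode
    return 68 - intra_mode        # modes 34..66 mirror back to sets 34..2
```

For example, mode 35 and mode 33 share set 33, reflecting the directional symmetry of the modes around the up-left diagonal mode No. 34.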
NSST index values 1 to 3 can then indicate the three transform cores. [098] Referring again to FIG. 3, the transformer can perform the non-separable secondary transform based on the selected transform core, and can obtain the (secondary) transform coefficients. The (secondary) transform coefficients can be derived as quantized transform coefficients through the quantization unit as described above, can be encoded and signaled to the decoding apparatus, and can be transmitted to the dequantizer / inverse transformer of the encoding apparatus. [099] The inverse transformer can perform a series of procedures in the reverse order of the procedures performed in the transformer. The inverse transformer can receive the (dequantized) (secondary) transform coefficients, can derive the (primary) transform coefficients by performing a secondary (inverse) transform (S350), and can obtain a residual block (residual samples) by performing a primary (inverse) transform on the (primary) transform coefficients. In this case, the primary transform coefficients can be called modified transform coefficients from the point of view of the inverse transformer. The encoding apparatus and the decoding apparatus can generate a reconstruction block based on the residual block and the predicted block, and can generate a reconstruction image based on the reconstruction block, as described above. [0100] Meanwhile, the size of a transform core (NSST core) for the non-separable secondary transform may or may not be fixed, and transform cores (NSST cores) having different sizes can be configured together within one set. [0101] For example, a 4x4 NSST set can include only 4x4 NSST cores, and an 8x8 NSST set can include only 8x8 NSST cores, depending on the size of the target block (or sub-block or transform coefficient block). [0102] As another example, a mixed NSST set can be configured as follows. The mixed NSST set can include NSST cores having different sizes.
For example, the mixed NSST set can include a 4x4 NSST core in addition to an 8x8 NSST core. An NSST set including only 8x8 NSST cores or only 4x4 NSST cores can, by comparison with a mixed NSST set, be called a non-mixed NSST set. [0103] The number of NSST cores included in a mixed NSST set can be fixed or variable. For example, an NSST set #1 can include 3 NSST cores, and an NSST set #2 can include 4 NSST cores. In addition, the order of the NSST cores included in a mixed NSST set need not be fixed and can be defined differently depending on the NSST set. For example, in the NSST set #1, NSST cores 1, 2 and 3 can be mapped to NSST indices 1, 2 and 3, respectively; in the NSST set #2, NSST cores 3, 2 and 1 can be mapped to NSST indices 1, 2 and 3, respectively. [0104] Specifically, the priority of the NSST cores available within an NSST set can be determined based on the size of the NSST cores (for example, 8x8 NSST core or 4x4 NSST core). For example, if the corresponding target block is of a given size or more, an 8x8 NSST core can have higher priority than a 4x4 NSST core. In this case, an NSST index having a lower value can preferentially be assigned to the 8x8 NSST core. [0105] Additionally, the priority of the NSST cores available within an NSST set can be determined based on the order (first, second and third) of the NSST cores. For example, a first 4x4 NSST core can have higher priority than a second 4x4 NSST core. [0106] Specifically, for example, the mapping between NSST cores and NSST indices within an NSST set can include the embodiments shown in Table 3 or 4.

[Table 3]

NSST index | 4x4 NSST set | 8x8 NSST set | Mixed NSST set
1          | 1st 4x4 core | 1st 8x8 core | 1st 8x8 core
2          | 2nd 4x4 core | 2nd 8x8 core | 2nd 8x8 core
3          | 3rd 4x4 core | 3rd 8x8 core | 1st 4x4 core

[Table 4]

NSST index | Mixed NSST set type 1 | Mixed NSST set type 2 | Mixed NSST set type 3
1          | 3rd 8x8 core          | 1st 8x8 core          | 1st 4x4 core
2          | 2nd 8x8 core          | 2nd 8x8 core          | 1st 8x8 core
3          | 1st 8x8 core          | 1st 4x4 core          | 2nd 4x4 core
4          | N.D.                  | 2nd 4x4 core          | 2nd 8x8 core
5          | N.D.                  | 3rd 4x4 core          | N.D.

[0107] Whether a mixed NSST set is used can be indicated by several methods. For example, whether a mixed NSST set is used can be determined based on an intra prediction mode of the target block (or of the CU including the target block) and/or the size of the target block. [0108] For example, whether a mixed NSST set is used can be determined based on an intra prediction mode of the target block. In other words, whether a mixed NSST set is used for a given intra prediction mode, or an individual (non-mixed) NSST set based on the sub-block size is used, can be predetermined. Accordingly, an NSST set suitable for the current target block can be determined, and an appropriate NSST core can be applied. For example, whether the mixed NSST set is used can be indicated as in the following table, depending on the intra prediction mode.

[Table 5]

Intra mode: 0 1 2 3 4 5  6 ... 15  16 17 18 19 20 21  22 ... 31  32 33
Mixed type: 1 1 1 1 1 1  0 ...  0   1  1  1  1  1  1   0 ...  0   1  1

Intra mode: 34 35 36  37 ... 46  47 48 49 50 51 52  53 ... 62  63 64 65 66  67
Mixed type:  1  1  1   0 ...  0   1  1  1  1  1  1   0 ...  0   1  1  1  1   0

[0109] In this case, the mixed type information indicates whether a mixed NSST set is applied to the target block, based on the intra prediction mode. This can be used in association with the method disclosed in Table 2.
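The index-to-core mappings of Tables 3 and 4 can be sketched by representing each set as an ordered list (a hypothetical data layout; the actual kernel matrices are omitted and each core is identified only by its size and order):

```python
# Each NSST set is an ordered list of (core_size, core_order) pairs; list
# position is NSST index - 1, with index 0 reserved for "NSST not applied".
NSST_SETS = {
    "4x4":   [("4x4", 1), ("4x4", 2), ("4x4", 3)],
    "8x8":   [("8x8", 1), ("8x8", 2), ("8x8", 3)],
    # Mixed set of Table 3: the 8x8 cores take the lower indices.
    "mixed": [("8x8", 1), ("8x8", 2), ("4x4", 1)],
    # A longer mixed set in the spirit of Table 4 (variable core count).
    "mixed_type_2": [("8x8", 1), ("8x8", 2), ("4x4", 1), ("4x4", 2), ("4x4", 3)],
}

def select_core(set_name, nsst_index):
    """Return the core designated by a (1-based) NSST index, or None when
    the index is 0 (no NSST applied) or not available (N.D.) in this set."""
    if nsst_index == 0:
        return None
    cores = NSST_SETS[set_name]
    return cores[nsst_index - 1] if nsst_index <= len(cores) else None
```

Because the lists have different lengths, the maximum NSST index value naturally varies per set, which is what motivates the per-set binarization of Table 6.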
For example, the mixed type information can indicate whether a non-mixed NSST set is mapped and used for each intra prediction mode as described above with respect to Table 2, or whether a mixed NSST set is configured and used. Specifically, if the value of the mixed type information is 1, a mixed NSST set defined in the system can be configured and used instead of a non-mixed NSST set. In this case, the mixed NSST set defined in the system indicates the aforementioned mixed NSST set. If the value of the mixed type information is 0, the non-mixed NSST set based on the intra prediction mode can be used. The mixed type information can be called a mixed type flag indicating whether the mixed NSST set is used. According to the present embodiment, the two types of NSST sets (non-mixed NSST set and mixed NSST set) can be used in an adaptive/variable way based on the mixed type flag. [0110] Meanwhile, two or more mixed NSST sets can be configured. In this case, the mixed type information can indicate N (N can be greater than or equal to 2) types through various values. In this case, the mixed type information can be called a mixed type index. [0111] As another example, whether a mixed NSST set is used can be determined by considering the intra prediction mode associated with the target block and the size of the target block at the same time. The target block can be called by several names, such as a sub-block, a transform block and a transform coefficient block. [0112] For example, mode type information can be configured in place of the mixed type information. If the value of the mode type information corresponding to an intra prediction mode is 0, a non-mixed NSST set can be defined. Otherwise (for example, if the value of the mode type information is 1), various mixed NSST sets can be determined based on the size of the corresponding target block. For example, if the intra mode is a non-directional mode (planar or DC), a mixed NSST set can be used.
If the intra mode is a directional mode, a non-mixed NSST set can be used. [0113] FIG. 5 illustrates a method of determining an NSST set based on an intra prediction mode and a block size. [0114] Referring to FIG. 5, the coding apparatus (encoding apparatus and/or decoding apparatus) derives the (secondary) transform coefficients by dequantizing the (quantized) transform coefficients (S540), and derives the (primary) transform coefficients by the (inverse) secondary transform of the (secondary) transform coefficients (S550). In this case, the (secondary) transform coefficients can be called temporary transform coefficients, and the (primary) transform coefficients can be called modified transform coefficients. In this case, the secondary transform can include the non-separable secondary transform, which is performed based on an NSST core. The NSST core can be selected from an NSST set; in this case, the NSST core can be indicated within the NSST set based on NSST index information. [0115] The coding apparatus can select the NSST set from among NSST set candidates based on the intra prediction mode and the block size (S545). For example, the NSST set candidates can include at least one non-mixed NSST set and at least one mixed NSST set. For example, the NSST set candidates can include at least one of an 8x8 NSST set (non-mixed NSST set 1) including only 8x8 NSST cores and a 4x4 NSST set (non-mixed NSST set 2) including only 4x4 NSST cores, and can include one or more mixed NSST sets. In this case, for example, the coding apparatus can determine a specific NSST set from among the NSST set candidates based on whether each of the width (W) and height (H) of the target block is 8 or more and based on the current intra prediction mode number. A specific NSST core can then be indicated within the specific NSST set using the NSST index information, as described above.
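The selection step S545 can be sketched as follows (a hypothetical decision rule under the assumptions above; `use_mixed` stands in for the mixed type information derived per intra prediction mode, e.g. per Table 5):

```python
def select_nsst_set(width, height, intra_mode, use_mixed):
    """Sketch of S545: pick an NSST set candidate from the block size and
    the intra prediction mode. 'use_mixed' is the mixed type decision
    (hypothetical stand-in for the per-mode mixed type information)."""
    if intra_mode == 67:           # LM mode: the secondary transform is skipped
        return "none"
    if use_mixed:
        return "mixed"             # mixed set: 8x8 and 4x4 cores together
    # Non-mixed sets: 8x8 cores only if both dimensions are 8 or more,
    # otherwise 4x4 cores.
    return "8x8" if (width >= 8 and height >= 8) else "4x4"
```

The same rule runs identically at the encoder and the decoder, so no extra signalling is needed for the set itself, only for the NSST index within it.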
[0116] Meanwhile, the NSST index can be binarized using various methods for coding efficiency. In this case, the binarization values can be defined efficiently by considering a change in the statistical distribution of the NSST index values that are encoded and transmitted. That is, in this case, the core to be actually applied can be selected based on the syntax indicating a core size. [0117] As described above, according to the present embodiment, the number of NSST cores included in each transform set (NSST set) can differ. For efficient binarization, variable-length binarization can be performed based on a truncated unary (TU) code, as in the following table, based on the maximum NSST index value available for each NSST set.

[Table 6]

NSST index | Binarization 1 (max. index 2) | Binarization 2 (max. index 3) | Binarization 3 (max. index 5) | Binarization 4 (max. index 4)
0          | 0    | 0    | 0     | 0
1          | 10   | 10   | 10    | 10
2          | 11   | 110  | 110   | 110
3          | N.D. | 111  | 1110  | 1110
4          | N.D. | N.D. | 11110 | 1111
5          | N.D. | N.D. | 11111 | N.D.

[0118] In this case, the binary values "0" and "1" can be called bins. Each of the bins can be coded based on context using CABAC/CAVLC. In this case, a context modeling value can be determined based on at least one of the size of the target block (sub-block, transform block or transform coefficient block), the intra prediction mode, a mixed information value (mixed mode information) and the maximum NSST index value of the corresponding NSST set. In this case, the context model can be indicated based on a context index. The context index can be indicated as the sum of a context offset and a context increment. [0119] FIG. 6 schematically illustrates an example of a video/image encoding method including a transform method according to the present embodiment. The method disclosed in FIG. 6 can be performed by the encoding apparatus disclosed in FIG. 1. Specifically, for example, S600 to S630 of FIG.
6 can be performed by the transformer of the encoding apparatus. [0120] Referring to FIG. 6, the encoding apparatus obtains transform coefficients for a target block (S600). The encoding apparatus can obtain residual samples for the target block by comparing an original block and a predicted block, and can obtain the transform coefficients for the target block through the primary transform of the residual samples. The primary transform includes a procedure that transforms residual samples in a spatial domain into transform coefficients in a frequency domain. In this case, the target block can include a sub-block, transform block or transform coefficient block within a CU. [0121] The encoding apparatus determines an NSST set for the target block (S610). The NSST set can include an NSST core used for a secondary transform. The secondary transform includes a non-separable secondary transform. The NSST set for the target block can be determined based on at least one of an intra prediction mode and the size of the target block. [0122] The NSST set can include 8x8 NSST cores or 4x4 NSST cores. In this case, the NSST set can be called a non-mixed NSST set. Whether the NSST set includes 8x8 NSST cores or 4x4 NSST cores can be determined based on the size of the target block, as described above. [0123] Alternatively, the NSST set can be a mixed NSST set including a 4x4 NSST core and an 8x8 NSST core. In this case, an index value assigned to the 8x8 NSST core can be lower than an index value assigned to the 4x4 NSST core. For example, if the size of the target block is larger than a predefined reference size, an index value assigned to the 8x8 NSST core can be lower than an index value assigned to the 4x4 NSST core. Alternatively, conversely, an index value assigned to the 4x4 NSST core can be lower than an index value assigned to the 8x8 NSST core.
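The size-dependent index assignment described above can be sketched as follows (assuming a hypothetical reference size of 8x8; the actual reference size is a design choice of the coding system):

```python
def order_mixed_set(cores, block_w, block_h, ref_size=8):
    """Sketch of size-priority ordering in a mixed NSST set: lower NSST
    index values go to the higher-priority core size. 'cores' is a list of
    (size, order) pairs, e.g. [("4x4", 1), ("8x8", 1), ("8x8", 2)]."""
    big_block = block_w > ref_size and block_h > ref_size
    # Above the reference size, 8x8 cores get the lower indices;
    # otherwise the 4x4 cores come first. Ties break on core order.
    preferred = "8x8" if big_block else "4x4"
    return sorted(cores, key=lambda c: (c[0] != preferred, c[1]))
```

Index i of the returned list then corresponds to NSST index i + 1, matching the "lower index to higher priority" rule of [0104] and [0105].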
[0124] The NSST set can include a plurality of NSST cores, and the number of NSST cores can be defined in a variable way. For example, the number of NSST cores included in a first NSST set can be different from the number of NSST cores included in a second NSST set. [0125] Meanwhile, whether a non-mixed NSST set or a mixed NSST set is used as the NSST set for the target block can be indicated based on mixed type information or mixed mode information. [0126] For example, if the value of the mixed type information is 0, a non-mixed NSST set including 8x8 NSST cores or 4x4 NSST cores can be used. If the value of the mixed type information is not 0, a mixed NSST set including a 4x4 NSST core and an 8x8 NSST core can be used. If a plurality of mixed NSST sets is available, one of the plurality of mixed NSST sets can be indicated based on a mixed type information value of 1, 2, etc. [0127] The NSST set for the target block can be determined based on both an intra prediction mode and the size of the target block. The intra prediction mode can be, for example, one of 67 intra prediction modes (68 if an LM mode is included). The intra prediction mode can be a prediction mode associated with the target block, or it can be an intra prediction mode configured for a CU spatially covering the target block or a sub-block thereof. [0128] The encoding apparatus selects one of a plurality of NSST cores included in the NSST set and defines an NSST index (S620). The encoding apparatus can select one of the plurality of NSST cores included in the NSST set by repeated calculation based on an RD cost, and can define the NSST index as a value indicating the selected NSST core. [0129] The encoding apparatus generates modified transform coefficients by the non-separable secondary transform of the transform coefficients based on the selected NSST core (S630).
The encoding apparatus can encode and output the modified transform coefficients according to a determined procedure. In this case, at least one of the mixed type information, the mixed mode information and the information on the NSST index can be encoded as follows. The encoding apparatus can output the encoded information in the form of a bitstream. The bitstream can be transmitted to the decoding apparatus via a network or via a storage medium. [0130] If the information on the NSST index is encoded, the NSST index value can be binarized with variable length. In this case, for example, as shown in Table 6, the NSST index value can be binarized according to a truncated unary (TU) scheme. Meanwhile, the NSST index value can be encoded based on context, for example using CABAC or CAVLC. In this case, a context model can be determined based on at least one of the size of the target block, an intra prediction mode, a mixed information value and the maximum index value within the NSST set. [0131] FIG. 7 schematically illustrates an example of a video/image decoding method including a transform method according to the present embodiment. The method disclosed in FIG. 7 can be performed by the decoding apparatus disclosed in FIG. 2. Specifically, for example, in FIG. 7, S700 can be performed by the dequantizer of the decoding apparatus, and S710 to S730 can be performed by the inverse transformer of the decoding apparatus. Meanwhile, in the present embodiment, the decoding apparatus is basically described, but the method disclosed in FIG. 7 can be performed in an identical manner in the dequantizer and inverse transformer of the encoding apparatus. [0132] Referring to FIG. 7, the decoding apparatus obtains transform coefficients for a target block (S700). The decoding apparatus can obtain the transform coefficients by dequantizing the quantized transform coefficients for the target block, obtained from the information received through a bitstream.
In this case, the target block can include a sub-block, transform block or transform coefficient block within a CU. [0133] The decoding apparatus determines an NSST set for the target block (S710). The NSST set can include an NSST core for a secondary transform. The secondary transform includes a non-separable secondary transform. The NSST set for the target block can be determined based on at least one of an intra prediction mode and the size of the target block. [0134] The NSST set can include 8x8 NSST cores or 4x4 NSST cores. In this case, the NSST set can be called a non-mixed NSST set. Whether the NSST set includes 8x8 NSST cores or 4x4 NSST cores can be determined based on the size of the target block, as described above. [0135] Alternatively, the NSST set can be a mixed NSST set including a 4x4 NSST core and an 8x8 NSST core. In this case, an index value assigned to the 8x8 NSST core can be lower than an index value assigned to the 4x4 NSST core. For example, if the size of the target block is larger than a predefined reference size, an index value assigned to the 8x8 NSST core can be lower than an index value assigned to the 4x4 NSST core. Alternatively, conversely, an index value assigned to the 4x4 NSST core can be lower than an index value assigned to the 8x8 NSST core. [0136] The NSST set can include a plurality of NSST cores, and the number of NSST cores can be defined in a variable way. For example, the number of NSST cores included in a first NSST set can be different from the number of NSST cores included in a second NSST set. [0137] Meanwhile, whether a non-mixed NSST set or a mixed NSST set is used as the NSST set for the target block can be determined based on mixed type information or mixed mode information.
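The truncated unary binarization used for the NSST index (Table 6 above) can be sketched as:

```python
def tu_binarize(value, max_index):
    """Truncated unary (TU) binarization of an NSST index: 'value' ones
    followed by a terminating zero, except that the codeword for the
    maximum index drops the trailing zero (as in Table 6)."""
    if not 0 <= value <= max_index:
        raise ValueError("NSST index out of range for this NSST set")
    return "1" * value + ("" if value == max_index else "0")
```

Because the codeword lengths depend on `max_index`, each NSST set signals its index with a code matched to the number of cores it actually contains, which is the point of the per-set binarizations in Table 6.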
[0138] For example, if a value of the mixed type information is 0, a non-mixed NSST set including 8x8 NSST cores or 4x4 NSST cores can be used. If the value of the mixed type information is not 0, a mixed NSST set including a 4x4 NSST core and an 8x8 NSST core can be used. If a plurality of mixed NSST sets is available, one of the plurality of mixed NSST sets can be indicated based on a value of 1, 2, etc. of the mixed type information. [0139] The NSST set for the target block can be determined based on both an intra prediction mode and the size of the target block. The intra prediction mode can be one of 67 intra prediction modes (68 if an LM mode is included), for example. The intra prediction mode can be a prediction mode associated with the target block, or it can be an intra prediction mode configured in a CU spatially covering the target block or a sub-block thereof. [0140] The decoding apparatus selects one of a plurality of NSST cores included in the NSST set based on an NSST index (S720). The NSST index can be obtained through a bit stream. The decoding apparatus can obtain an NSST index value through (entropy) decoding. The NSST index value can be binarized with variable length. In this case, for example, as shown in Table 6, the NSST index value can be binarized according to a truncated unary (TU) scheme. Meanwhile, the NSST index value can be decoded based on context, such as CABAC or CAVLC. In this case, a context model can be determined based on at least one of the target block size, an intra prediction mode, a mixed type information value and a maximum index value within the NSST set. [0141] The decoding apparatus generates modified transform coefficients through the non-separable secondary (inverse) transform of the transform coefficients based on the selected NSST core (S730).
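The following sketch is provided for illustration only and does not form part of the patent text. It condenses the set-selection rules of paragraphs [0134] to [0138]; the reference size of 8, the core names and the number of cores per set are assumptions for illustration, not values fixed by the patent:

```python
def determine_nsst_set(mixed_type_info, block_width, block_height):
    """Choose an NSST core set (illustrative only).

    mixed_type_info == 0 -> non-mixed set (all 8x8 or all 4x4 cores,
                            chosen by the target block size)
    mixed_type_info != 0 -> a mixed set containing both core sizes
    """
    # Hypothetical reference size: use 8x8 cores when both sides >= 8.
    REF_SIZE = 8
    large = block_width >= REF_SIZE and block_height >= REF_SIZE
    if mixed_type_info == 0:
        if large:
            return ["8x8_core_0", "8x8_core_1", "8x8_core_2"]
        return ["4x4_core_0", "4x4_core_1", "4x4_core_2"]
    # Mixed set: for large blocks the 8x8 core receives the lower index
    # (paragraph [0135]); otherwise the 4x4 core comes first.
    if large:
        return ["8x8_core_0", "4x4_core_0"]
    return ["4x4_core_0", "8x8_core_0"]
```

The list position of each core is its index value, so the NSST index parsed from the bit stream (claim 1) directly selects one entry of the returned set.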
The decoding apparatus can obtain residual samples for the target block by performing a primary (inverse) transform on the modified transform coefficients. [0142] The decoding apparatus can obtain reconstruction samples by combining prediction samples, obtained based on the results of intra prediction, with the residual samples, and can reconstruct a picture based on the reconstruction samples. [0143] Then, the decoding apparatus can apply an in-loop filtering procedure, such as a deblocking filter, an SAO and/or ALF procedure, to the reconstructed picture in order to improve the subjective/objective image quality, if necessary, as described above. [0144] The method according to the present embodiment can be implemented in software form. The encoding apparatus and/or the decoding apparatus according to the present embodiment can be included in an apparatus for carrying out image processing, such as a TV, a computer, a smartphone, a set-top box, or a display apparatus. [0145] In the present disclosure, if the embodiments are implemented in software, the method can be implemented as a module (process or function) that performs the above functions. The module can be stored in memory and executed by a processor. The memory can be positioned inside or outside the processor, and can be connected to the processor by various well-known means. The processor may include an application-specific integrated circuit (ASIC), other chipsets, logic circuits and/or data processors. The memory may include a read-only memory (ROM), a random access memory (RAM), a flash memory, a memory card, a storage medium and/or other storage devices.
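The following sketch is provided for illustration only and does not form part of the patent text. It outlines the decoding flow of FIG. 7 (S700 to S730) followed by the primary inverse transform; every name is a hypothetical placeholder, `qp_scale` stands in for the real dequantization process, and each core is modeled as a matrix applied to the flattened top-left low-frequency region of the coefficient block:

```python
def matvec(matrix, vec):
    """Plain matrix-vector product on a list-of-lists matrix."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def decode_residual(quantized, qp_scale, nsst_set, nsst_index, primary_itx):
    """Illustrative outline of S700-S730 plus the primary inverse transform."""
    # S700: dequantize to obtain the transform coefficients.
    coeffs = [[c * qp_scale for c in row] for row in quantized]
    # S710/S720: the NSST set is assumed to have been determined from the
    # intra prediction mode and/or block size; pick one core by its index.
    core = nsst_set[nsst_index]
    k = int(len(core) ** 0.5)  # e.g. a 16-row core acts on a 4x4 region
    # S730: inverse non-separable secondary transform of the top-left
    # region, expressed as a matrix-vector product on the flattened block.
    region = [coeffs[r][c] for r in range(k) for c in range(k)]
    modified = matvec(core, region)
    for r in range(k):
        for c in range(k):
            coeffs[r][c] = modified[r * k + c]
    # The primary (inverse) transform of the modified coefficients
    # yields the residual samples.
    return primary_itx(coeffs)
```

With an identity core and an identity primary transform, the output is simply the dequantized block, which makes the role of each stage easy to check in isolation.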
Claims (15)
[1] 1. Transform method performed by a decoding apparatus, the method being CHARACTERIZED by comprising: receiving a non-separable transform index; obtaining transform coefficients for a target block; determining a non-separable transform set for the target block; selecting one of a plurality of non-separable transform cores included in the non-separable transform set based on the non-separable transform index; and generating modified transform coefficients through the non-separable transform of the transform coefficients based on the selected non-separable transform core, wherein the non-separable transform set for the target block is determined based on at least one of an intra prediction mode and a target block size, and wherein a non-separable transform index value is represented based on a truncated unary (TU) binarization.
[2] 2. Method, according to claim 1, CHARACTERIZED by the fact that the non-separable transform set is a mixed non-separable transform set comprising a 4x4 non-separable transform core and an 8x8 non-separable transform core.
[3] 3. Method, according to claim 2, CHARACTERIZED by the fact that an index value assigned to the 8x8 non-separable transform core is less than an index value assigned to the 4x4 non-separable transform core.
[4] 4. Method, according to claim 2, CHARACTERIZED by the fact that, if the size of the target block is greater than a predefined reference size, an index value assigned to the 8x8 non-separable transform core is less than an index value assigned to the 4x4 non-separable transform core.
[5] 5. Method, according to claim 2, CHARACTERIZED by the fact that a number of the non-separable transform cores included in the non-separable transform set is variable.
[6] 6.
Method, according to claim 1, CHARACTERIZED by additionally comprising: obtaining mixed type information; and determining whether a mixed non-separable transform set is used based on the mixed type information.
[7] 7. Method, according to claim 6, CHARACTERIZED by the fact that: if a value of the mixed type information is 0, a non-mixed non-separable transform set including 8x8 non-separable transform cores or 4x4 non-separable transform cores is used, and if the value of the mixed type information is not 0, a mixed non-separable transform set including a 4x4 non-separable transform core and an 8x8 non-separable transform core is used.
[8] 8. Method, according to claim 1, CHARACTERIZED by the fact that the non-separable transform set for the target block is determined based on both the intra prediction mode and the size of the target block.
[9] 9. Method, according to claim 1, CHARACTERIZED by the fact that the non-separable transform index value is binarized with variable length.
[10] 10. Method, according to claim 1, CHARACTERIZED by the fact that a maximum value of the non-separable transform index is 2, and wherein the value 0 of the non-separable transform index is represented by the binary string “0”, the value 1 of the non-separable transform index is represented by the binary string “10”, and the value 2 of the non-separable transform index is represented by the binary string “11”.
[11] 11. Method, according to claim 1, CHARACTERIZED by the fact that: a non-separable transform index value is obtained based on context-based decoding, and a context model for the context-based decoding of the non-separable transform index value is determined based on at least one of the size of the target block, the intra prediction mode, a mixed type information value and a maximum index value within the non-separable transform set.
[12] 12.
Transform method performed by an encoding apparatus, the method being CHARACTERIZED by comprising: obtaining transform coefficients for a target block; determining a non-separable transform set for the target block; selecting one of a plurality of non-separable transform cores included in the non-separable transform set, wherein the selected non-separable transform core is used for the non-separable transform of the transform coefficients for the target block; generating a non-separable transform index specifying the non-separable transform core selected from the non-separable transform set; and encoding information on the non-separable transform index, wherein the non-separable transform set for the target block is determined based on at least one of an intra prediction mode and a target block size, and wherein a non-separable transform index value is represented based on a truncated unary (TU) binarization.
[13] 13. Method, according to claim 12, CHARACTERIZED by the fact that a maximum value of the non-separable transform index is 2, and wherein the value 0 of the non-separable transform index is represented by the binary string “0”, the value 1 of the non-separable transform index is represented by the binary string “10”, and the value 2 of the non-separable transform index is represented by the binary string “11”.
[14] 14.
Digital storage medium storing information that causes a decoding apparatus to perform a transform method, the method being CHARACTERIZED by comprising: obtaining a non-separable transform index; obtaining transform coefficients for a target block; determining a non-separable transform set for the target block; selecting one of a plurality of non-separable transform cores included in the non-separable transform set based on the non-separable transform index; and generating modified transform coefficients through the non-separable transform of the transform coefficients based on the selected non-separable transform core, wherein the non-separable transform set for the target block is determined based on at least one of an intra prediction mode and a target block size, and wherein a non-separable transform index value is represented based on a truncated unary (TU) binarization.
[15] 15. Digital storage medium, according to claim 14, CHARACTERIZED by the fact that a maximum value of the non-separable transform index is 2, and wherein the value 0 of the non-separable transform index is represented by the binary string “0”, the value 1 of the non-separable transform index is represented by the binary string “10”, and the value 2 of the non-separable transform index is represented by the binary string “11”.
Similar technologies:
Publication number | Publication date | Patent title
BR112019019702A2 | 2020-04-14 | transformation method into an image coding system and apparatus for the same
CN110073661B | 2021-09-14 | Method and apparatus for encoding and decoding video data
US10674146B2 | 2020-06-02 | Method and device for coding residual signal in video coding system
ES2700621T3 | 2019-02-18 | Device to decode an image
CA3007664A1 | 2017-07-20 | Multi-type-tree framework for video coding
US10721479B2 | 2020-07-21 | Intra prediction method and apparatus in image coding system
KR20210092337A | 2021-07-23 | Inter prediction method and apparatus therefor
ES2743227T3 | 2020-02-18 | Method to encode/decode image
US10701356B2 | 2020-06-30 | Image decoding method and apparatus by deriving a frequency component based on filtering in image coding system
US20210195191A1 | 2021-06-24 | Image encoding method/device, image decoding method/device and recording medium having bitstream stored therein
BR112019014090A2 | 2020-02-04 | Intra prediction techniques for video encoding
KR20180037581A | 2018-04-12 | Method and apparatus for encoding/decoding image and recording medium for storing bitstream
US10694187B2 | 2020-06-23 | Method and device for deriving block structure in video coding system
US20210176499A1 | 2021-06-10 | Image decoding method and device in accordance with block split structure in image coding system
BR112019022971B1 | 2022-02-15 | Video decoding method performed by a decoding device, video encoding method performed by an encoding device, and computer readable storage media
BR122021003526B1 | 2022-02-15 | Video decoding method performed by a decoding device, video encoding method performed by an encoding device, and computer readable storage media
US10924762B2 | 2021-02-16 | Image decoding method and apparatus relying on intra prediction in image coding system
US11070831B2 | 2021-07-20 | Method and device for processing video signal
US20190230352A1 | 2019-07-25 | Image decoding method and apparatus in image coding system
Family patents:
Publication number | Publication date
AU2018239635A1 | 2019-10-17
RU2733279C2 | 2020-10-01
EP3588952A4 | 2020-03-11
RU2020116853A | 2020-06-08
KR102171362B1 | 2020-10-28
JP2020510374A | 2020-04-02
US20200177889A1 | 2020-06-04
MX2019011211A | 2019-12-05
RU2722394C1 | 2020-05-29
RU2020116853A3 | 2020-07-24
EP3588952A1 | 2020-01-01
CN110546952A | 2019-12-06
EP3588952B1 | 2021-04-28
CA3057445A1 | 2018-09-27
KR20190125482A | 2019-11-06
WO2018174402A1 | 2018-09-27
Cited references:
Publication number | Application date | Publication date | Applicant | Patent title
KR100261253B1 | 1997-04-02 | 2000-07-01 | Yun Jong-yong | Scalable audio encoder/decoder and audio encoding/decoding method
EP1510078B1 | 2002-05-28 | 2007-04-25 | Sharp Kabushiki Kaisha | Methods and systems for image intra-prediction mode estimation, communication, and organization
KR20210064398A | 2009-01-27 | 2021-06-02 | InterDigital VC Holdings, Inc. | Methods and apparatus for transform selection in video encoding and decoding
JP5158003B2 | 2009-04-14 | 2013-03-06 | Sony Corporation | Image coding apparatus, image coding method, and computer program
KR101620441B1 | 2009-06-17 | 2016-05-24 | Ariscale Co., Ltd. | Method for multiple interpolation filters, and apparatus for encoding by using the same
US20120127003A1 | 2009-08-06 | 2012-05-24 | Youji Shibahara | Coding method, decoding method, coding apparatus, and decoding apparatus
CN102215391B | 2010-04-09 | 2013-08-28 | Huawei Technologies Co., Ltd. | Video data encoding and decoding method and device as well as transform processing method and device
WO2011125313A1 | 2010-04-09 | 2011-10-13 | Mitsubishi Electric Corporation | Video encoding device and video decoding device
US9661338B2 | 2010-07-09 | 2017-05-23 | Qualcomm Incorporated | Coding syntax elements for adaptive scans of transform coefficients for video coding
CN101938654B | 2010-08-17 | 2013-04-10 | Zhejiang University | Method and device for optimizing and quantifying conversion coefficients
US10992958B2 | 2010-12-29 | 2021-04-27 | Qualcomm Incorporated | Video coding using mapped transforms and scanning modes
US8755620B2 | 2011-01-12 | 2014-06-17 | Panasonic Corporation | Image coding method, image decoding method, image coding apparatus, image decoding apparatus, and image coding and decoding apparatus for performing arithmetic coding and/or arithmetic decoding
CN103096053B | 2011-11-04 | 2015-10-07 | Huawei Technologies Co., Ltd. | A kind of decoding method of pattern conversion and device
WO2014084656A1 | 2012-11-29 | 2014-06-05 | LG Electronics Inc. | Method and device for encoding/decoding image supporting plurality of layers
US10681379B2 | 2015-09-29 | 2020-06-09 | Qualcomm Incorporated | Non-separable secondary transform for video coding with reorganizing
US20220078483A1 | 2018-12-19 | 2022-03-10 | Lg Electronics Inc. | Method and device for processing video signal by using intra-prediction
KR20210102462A | 2019-02-24 | 2021-08-19 | LG Electronics Inc. | Video coding method and apparatus based on quadratic transformation
CN113596447A | 2019-03-09 | 2021-11-02 | Hangzhou Hikvision Digital Technology Co., Ltd. | Method, decoding end, encoding end and system for encoding and decoding
EP3949423A1 | 2019-04-16 | 2022-02-09 | Mediatek Inc. | Methods and apparatuses for coding video data with secondary transform
CN113826395A | 2019-04-16 | 2021-12-21 | LG Electronics Inc. | Matrix-based intra-prediction transform in image coding
WO2020226424A1 | 2019-05-08 | 2020-11-12 | LG Electronics Inc. | Image encoding/decoding method and device for performing MIP and LFNST, and method for transmitting bitstream
WO2020253642A1 | 2019-06-15 | 2020-12-24 | Beijing Bytedance Network Technology Co., Ltd. | Block size dependent use of secondary transforms in coded video
CN114009023A | 2019-06-19 | 2022-02-01 | LG Electronics Inc. | Image coding method and device based on transformation
KR20210002106A | 2019-06-25 | 2021-01-06 | Wilus Institute of Standards and Technology Inc. | Video signal processing method and apparatus using quadratic transformation
CN112135148A | 2019-06-25 | 2020-12-25 | Huawei Technologies Co., Ltd. | Non-separable transformation method and device
WO2021010680A1 | 2019-07-12 | 2021-01-21 | LG Electronics Inc. | Image coding method based on transform, and device for same
WO2021054797A1 | 2019-09-19 | 2021-03-25 | Wilus Institute of Standards and Technology Inc. | Video signal processing method and apparatus using scaling process
WO2021054691A1 | 2019-09-20 | 2021-03-25 | LG Electronics Inc. | Transform-based image coding method, and device therefor
WO2021054779A1 | 2019-09-21 | 2021-03-25 | LG Electronics Inc. | Transform-based video coding method, and device therefor
WO2021139572A1 | 2020-01-08 | 2021-07-15 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Encoding method, decoding method, encoder, decoder, and storage medium
WO2021141443A1 | 2020-01-10 | 2021-07-15 | LG Electronics Inc. | Transform-based image coding method and device for same
WO2021169994A1 | 2020-02-25 | 2021-09-02 | Mediatek Inc. | Methods and apparatus for secondary transform signaling in video coding
WO2021238828A1 | 2020-05-27 | 2021-12-02 | Beijing Bytedance Network Technology Co., Ltd. | Indication of multiple transform matrices in coded video
Legal status:
2021-10-19 | B350 | Update of information on the portal [chapter 15.35 patent gazette]
Priority:
Application number | Application date | Patent title
US201762474574P | 2017-03-21 |
PCT/KR2018/001320 (WO2018174402A1) | 2017-03-21 | Transform method in image coding system and apparatus for same